Confirmatory Factor Analyses in Psychological Test Adaptation and Development
Authors
Abstract
Open Access. Confirmatory Factor Analyses in Psychological Test Adaptation and Development: A Nontechnical Discussion of the WLSMV Estimator. Kay Brauer (Department of Psychology, Martin Luther University Halle-Wittenberg, Emil-Abderhalden-Str. 26-27, 06099 Halle, Germany; https://orcid.org/0000-0002-7398-8457), Jochen Ranger (https://orcid.org/0000-0001-5110-1213), Matthias Ziegler (Humboldt-Universität zu Berlin; https://orcid.org/0000-0003-4994-9519). Published online: February 07, 2023. https://doi.org/10.1027/2698-1866/a000034
The importance providing structural validity evidence test score(s) derived from psychometric instruments is highlighted several institutions; example, American Association (2014) demands that an instruments’ internal structure its underlying measurement model must be provided before it applied psychological assessment. The knowledge about latent data obtained with tests addressing major question “What is/are construct[s] being measured” under investigation (Ziegler, 2014, 2020). study typically addressed factor analyses when scores reflect continuous traits. As most submissions Development (PTAD) deal adaptation further development existing measures, authors a based on theoretical considerations prior findings original versions (or adaptations) investigation. Our literature review PTAD’s publications showed than 90% articles contain at least one confirmatory analysis (CFA). As editor reviewers PTAD, we appreciate are rigorous their tests’ data. 
However, since inception 2019, experience comment frequently communicated during process, namely, request adjust analytic approach CFA maximum likelihood (ML) estimation toward using mean- variance-adjusted weighted squares (WLSMV; Muthén et al., 1997) estimator account ordinal nature generate item level. In editorial, discuss rationale behind choosing analyzing adaptations developments categorical concisely illustrate problems associated ML (potentially combination robust fit) such data.A Short Recap Basic Confirmatory Analysis PrinciplesCFA aims testing predefined assumption (e.g., items). construction evaluation, each score items reflecting trait (Ziegler & Hagemann, 2015). particular, contains hypothesis which manifest indicators (i.e., items) should loaded factor(s). relations between variables laws Cronbach Meehl (1955) list as part nomological net. This net also includes variables, could tested CFA. short, allows whether priori assumed fits observed mentioned, PTAD rely provides assumptions regarding dimensionality number factors) item-factor assignment.For illustration, might imagine want examine model) translation self-report questionnaire consists 10 items. Let us assume empirical previous studies suggest two factors. For Items 1–5 I 6–10 II. We specify accordingly displayed Figure 1, loadings > 0 allowed At same time, unintended restricted II set zero; thinned lines 1). When reflective traits, factors cause intercorrelations among respective factor. Thus, if correct, would expect high 1–5, they share factor, showing no substantial 6–10, because latter represents distinct On other hand, 6 (Factor II), turn not substantially correlated 11–5.Figure 1 Note. Thinned paths represent zero. Residuals simplicity. Double arrow indicates interfactor correlation.CFA examines our hypothesized implications covariance (the implied fit responses structure). data, interested magnitude direction loading parameters overall fit.Model usually estimated estimator. 
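The two-factor illustration above (Items 1–5 on Factor I, Items 6–10 on Factor II, cross-loadings fixed to zero) can be sketched numerically. The following is a minimal sketch, not the authors' analysis: the loading of 0.7, the factor correlation of 0.3, and the function name `implied_covariance` are hypothetical values and names chosen only for illustration.

```python
# Hypothetical standardized values for the two-factor example (not from the article).
LOADING = 0.7       # assumed loading of every item on its intended factor
FACTOR_CORR = 0.3   # assumed correlation between Factor I and Factor II


def implied_covariance(loading=LOADING, factor_corr=FACTOR_CORR, n_items=10):
    """Return the model-implied covariance matrix Lambda * Phi * Lambda' + Theta."""
    half = n_items // 2
    # Loading matrix: the first half of the items load on Factor I, the second
    # half on Factor II; all cross-loadings are fixed to zero.
    lam = [[loading, 0.0] if i < half else [0.0, loading] for i in range(n_items)]
    phi = [[1.0, factor_corr], [factor_corr, 1.0]]
    resid = 1.0 - loading ** 2  # residual variance so that item variances equal 1
    sigma = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            sigma[i][j] = sum(
                lam[i][a] * phi[a][b] * lam[j][b]
                for a in range(2)
                for b in range(2)
            )
            if i == j:
                sigma[i][j] += resid
    return sigma


sigma = implied_covariance()
print(round(sigma[0][1], 3))  # Items 1 and 2 (same factor)
print(round(sigma[0][5], 3))  # Items 1 and 6 (different factors)
```

Under these assumed values, items that share a factor have a clearly higher implied correlation than items assigned to different factors, which is the pattern the hypothesized model predicts for the observed data.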
estimation, determines those parameter values (loadings, specific variances, intercorrelations) make matrix close possible matrix. defined via discrepancy function assesses agreement zero both matrices identical greater otherwise. smaller discrepancy, better does explain structure. can interpreted given (Lawley Maxwell, 1962). Hence, minimizing likely considering specified model. Provided holds, consistent. efficient normally distributed case sample covariances sufficient statistics. Sufficient statistics summarize without loss information. More technically, only aspect relevant function, mean ignored.
The degree model-to-data evaluated basis goodness-of-fit indexes inform absolute χ² value root-mean-square error approximation) relative alternative models; e.g., Tucker–Lewis index comparative index). These allow gauging how well reproduce variance–covariance cutoff Hu Bentler, 1999; see Greiff Heene, 2017; Heene 2011; Hopwood Donnellan, 2010, critical discussion). good fit, conclude well, exceeding intended while factor.1 contrary, would, models or modification revise accordingly, then revised independent avoid overfitting single (Fokkema Greiff, 2017).
While description recapitulates general proceeding interpreting CFA, noted accuracy estimations SEs choice method. One guiding principle best-suited met type analyzed.
Estimating Model Basis Continuous Discrete Data
Maximum Likelihood Robust Maximum Estimation
The default numerous standard statistical software packages. consistent long mild regularity conditions hold. include following:
• The correctly specified, generated model.
• The have distributions.
• The size large enough consistent.
• The identifiable, meaning accurately data.
The accurate estimates many scenarios dealing continuously multivariate normal data). An example indicator response visual analog scale, where takers place continuum poles 0% 100%; Flynn 2004). 
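For reference, the ML discrepancy function described in the recap above is commonly written in the form below. The symbols are assumptions introduced here and are not notation taken from the article: S denotes the sample covariance matrix, Σ(θ) the covariance matrix implied by the model parameters θ, and p the number of items.

```latex
F_{\mathrm{ML}}(\theta)
  = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
  - \ln\lvert S\rvert
  - p
```

In this form the function equals zero when Σ(θ) = S and is greater than zero otherwise, which matches the verbal description of the discrepancy function given above.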
has limitations do follow distribution, ordered data.
Categorical answered categories, dichotomous ability scored [incorrect] [correct]; Gnambs 2021) rating scales containing 3 k options, anchors (strongly disagree) agree; Dierickx Many assessment psychology neighboring sciences discrete (Simms 2019). Likert (1932) proposed use five although he did provide explanation point view choice, adopted suggestion Accordingly, nature.
As noted, merits, asymptotically unbiased, consistent, efficient. these attributes hold certain listed above. responses, thus, analyzed, distribution (see, Bollen, 1989). violated collected categories few options.
To distributions were N = 540 participants who responded assessing extraversion. Each participant four times, 2-, 4-, 6-, 8-point scale end apply applies very much, respectively.2 overview used options (cf. Simms 2 shows frequency 8-response option item. figure nicely portrays resulting options. Of course, especially visible checking 2-point 4-point (upper half 2), frequencies realization events indicate chose k. affected actual difficulty, empirically affects responses. additional concern realizations representing all limited. rarely chosen, even multiple categories: 3.7% (n = 20) responding 4-point 1.3% (n = 7) 6-point format, 0.9% (n = 5) 8-point format. some under-represented.
Figure Histograms extraversion scales.
The goes along misspecification words, popular field but discrete. important required satisfied consequence, loadings, SEs, potentially biased Beauducel Herzberg, 2006; Kaplan, 2009; Li, 2016). base interpretation inaccurate estimates, increases making erroneous conclusions Thereby, erroneous, too.
The (MLR) been introduced alternative. MLR eases normality (Bollen, brief, regular uses corrections statistic (for details, see, Chou 1991; Satorra 1994; Yuan 1998). While normality, still requires continuous, suitability producing debatable. 
comprehensive simulation study, Bandalos concluded considered viable recommends another estimator.
WLSMV produced (Muthén 1997). result categorization describes respond item: It level defines range Whenever within range, corresponding chosen. will process simulated (denoted x1 x2) 500 takers. might, amount personality questionnaire.
Figure categorized Dots τ11 τ22 denote thresholds employed into three options: SA strongly agree, neutral, SD disagree.
When questionnaire, takers’ mapped follows. 3, labels define continuum. disagree, agree). falls labels, category taker’s lies −1.5 −0.5 continuum, neutral ranges denoted Greek letter τ. There τ thresholds, defining τ12 Item x1, τ21 x2. All below choose first disagree x1. Those above third way, x2 chosen location relation τ22. note scatterplot cannot directly. Instead, tabulated Table gives cross-tabulation 3.
Table Cross-tabulation x2 (rows: response to x2; columns: response to x1)
                          x1: Strongly disagree    x1: Neutral    x1: Strongly agree
x2: Strongly disagree              35                  87               222
x2: Neutral                         0                  15               104
x2: Strongly agree                  0                   3                34
Note. responses.
Based estimate compare correlations dimensions answers correlation r .24, whereas .49. lower (z 4.59, p < .001).
In general, covariances) valid problematic converge true estimator, however, recovering cross-tabulations (see 1997, technical details). determine table frequencies, issue noncontinuous Such tetrachoric binary polychoric categories. recovered proceeds determining similar (which one). Similarity again assessed function. WLSMV differs fact distributed. core, sum squared differences correlations. increase efficiency al, complemented parallel ones estimation.
In depend used. infer With regard distributional assumptions, assumes line majority constructs studied (Li, 2016), specifically designed analyze mostly measures.
Comparisons Performance ML, MLR, WLSMV
The CFAs draw validity. 
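The categorization process described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reproduction of the authors' simulation: the latent correlation of 0.5, the threshold pairs, the seed, and the names `simulate`, `categorize`, and `pearson` are all hypothetical. Two continuous latent responses are drawn, cut into three ordered categories, and the Pearson correlation is computed before and after the cut.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def categorize(value, thresholds):
    """Map a latent value to category 1..k: 1 plus the number of thresholds it exceeds."""
    return 1 + sum(value > t for t in thresholds)


def simulate(n=500, rho=0.5, thresholds_x1=(-1.5, -0.5),
             thresholds_x2=(0.5, 1.5), seed=1):
    """Return (latent correlation, correlation of the categorized responses).

    rho and both threshold pairs are assumed values for illustration only.
    """
    rng = random.Random(seed)
    z1, z2 = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        z1.append(a)
        z2.append(b)
    c1 = [categorize(v, thresholds_x1) for v in z1]
    c2 = [categorize(v, thresholds_x2) for v in z2]
    return pearson(z1, z2), pearson(c1, c2)


r_latent, r_categorized = simulate()
print(f"latent r = {r_latent:.2f}, categorized r = {r_categorized:.2f}")
```

With thresholds that push the two items toward opposite ends of the response scale, the correlation of the categorized responses tends to fall well below the latent correlation. This is the attenuation that polychoric correlations, and with them the WLSMV estimator, are meant to address.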
aimed nontechnical understanding issues applying estimators discussed data.Simulation correlations, errors, convergence rates, size, complexity, non-normal distributions, estimators. shortly compared focused here.Beauducel Herzberg (2006) examined performance 2, 4, 5, samples 250, 500, 750, 1,000 respondents. They fixed .50 (oblique models) .55 (orthogonal found outperformed options) indexes. Moreover, is, underestimated yielded higher errors comparison irrespective initial systematic estimators, simulations consider approximatively characterized skewness and/or kurtosis), particular categories.To knowledge, Li (2016) date, comparing (MLR WLSMV). sizes 200, participants, slightly moderately sets, concerning 6, 8, loadings), sizes, characteristics. overestimated small ? 200) deviates MLR. Standard sensitive alike, (N 200), negligible larger samples. Finally, tends over-reject Taking together, concludes cautiously due bias errors. extended free restrictions limit interpretations. Most importantly, .70 item, uncharacteristically items.When combing studies, yielding ML-based estimators.Recommendation ConclusionMLR advantages disadvantages. there one-solution-fits-all recommendation, depends what researchers From experience, evaluating models. Considering current favor focus estimating interrelations cases. study’s main aim hypotheses instead primarily favored.One argue fruitful approaches transparently reporting across methods. ML(R) overlap comparatively analyzed 2006). converge, Li’s hints guide features responsible contribute expand future research factorial contextualizing adaptations.Finally, potential investigate measures Alternatively, theory (IRT) offers (Bond 2020; van der Linden, fact, shown made graded (Takane de Leeuw, 1987). models, ideally full information approaches, (Forero Maydeu-Olivares, 2009). complexity IRT often increased computational power time high-dimensional nonlinearly depending test. 
encourage submitting classical framework.In conclusion, discussion observation receives nonrobust speculate, reason packages Mplus Muthén, 1997–2017), AMOS (Arbuckle, 2019), CRAN R’s lavaan (Rosseel, 2012) preset method conveniently. certainty treating approximately ignoring recent seems through hope contributes why editors working CFAs.We grateful Rebekka Sendatzki her help preparing figures.ReferencesAmerican Association. (2014). Standards educational testing. AERA Publications. First citation articleGoogle ScholarArbuckle, J. L. (2019). Amos (Version 26.0) [Computer program]. IBM SPSS. ScholarBandalos, D. Relative diagonally estimation. Structural Equation Modeling, 21(1), 102–116. 10.1080/10705511.2014.859510 articleCrossref, Google ScholarBeauducel, A., P. Y. (2006). versus means variance adjusted 13(2), 186–203. 10.1207/s15328007sem1302_2 ScholarBollen, K. A. (1989). equations variables. Wiley. ScholarBond, T., Yan, Z., M. (2020). Applying Rasch model: Fundamental human sciences. Routledge. ScholarBrauer, K., Nussbeck, F. J., Zwiky, E., Proyer, R. T. (2022). Testing effects interpersonal perception [Manuscript preparation]. ScholarChou, C.-P., M., Satorra, (1991). Scaled analysis: A Monte Carlo study. British Journal Mathematical Statistical 44(2), 347–357. 10.1111/j.2044-8317.1991.tb00966.x ScholarCronbach, Meehl, E. (1955). Construct tests. Bulletin, 52(4), 281–302. 10.1037/h0040957 ScholarDierickx, S., Smits, D., Corr, Hasking, P., Claes, properties brief Dutch version Reinforcement Sensitivity Theory Personality Questionnaire. Development, 20–30. 10.1027/2698-1866/a000004 articleLink, ScholarFlynn, Schaik, Wersch, (2004). Comparison multi-item Visual Analogue Scales transactionally coping European Assessment, 20(1), 49–58. 10.1027/1015-5759.20.1.49 ScholarFokkema, S. (2017). How performing PCA equals trouble: Overfitting editorial thoughts it. 33(6), 399–402. 10.1027/1015-5759/a000460 ScholarForero, C., (2009). 
Estimation models: Limited Methods, 14(3), 275–299. 10.1037/a0015825 ScholarGnambs, Scharl, Rohm, (2021). Comparing perceptual speed contexts: students special needs. 93–101. 10.1027/2698-1866/a000013 ScholarGreiff, Why needs start worrying fit. 33(5), 313–317. 10.1027/1015-5759/a000450 ScholarHeene, Hilbert, Draxler, Ziegler, Bühner, (2011). Masking misfit increasing unique variances: cautionary usefulness indices. 16(3), 319–336. 10.1037/a0024917 ScholarHopwood, C. B. (2010). inventories evaluated? Social Psychology Review, 332–346. 10.1177/1088868310361240 ScholarHu, (1999). Cutoff criteria Conventional new alternatives. 6(1), 1–55. 10.1080/10705519909540118 ScholarKaplan, equation modeling: Foundations extensions (2nd ed.). Sage. ScholarLawley, (1962). Royal Society: Series D, 12(3), 209–229. 10.2307/2986915 ScholarLi, H. (2016). data: squares. Behavior Research 48(3), 936–949. 10.3758/s13428-015-0619-7 ScholarLikert, (1932). technique attitudes. Archives 140, 44–53. ScholarMuthén, B., du Toit, Spisic, (1997). inference quadratic variable modeling outcomes. Retrieved https://www.statmodel.com/download/Article_075.pdf O. (1997–2017). user’s guide. Muthén. ScholarRammstedt, John, (2005). Kurzversion des Big Five Inventory (BFI-K): Entwicklung und Validierung eines ökonomischen Inventars zur Erfassung fünf Faktoren Persönlichkeit [Short validation economic inventory personality]. Diagnostica, 51(4), 195–206. 10.1026/0012-1924.51.4.195 ScholarRosseel, (2012). lavaan: R package modeling. Software, 48(2), 1–36. 10.18637/jss.v048.i02 ScholarSatorra, (1994). Corrections analysis. von EyeC. Clogg (Eds.), Latent Applications developmental (pp. 399–419). ScholarSimms, Zelazny, Williams, F., Bernstein, Does matter? Psychometric perspectives 31(4), 557–566. 10.1037/pas0000648 ScholarTakane, Y., (1987). relationship discretized Psychometrika, 52(3), 393–408. 10.1007/BF02294363 Scholarvan W. Handbook theory. Taylor Francis. 10.1201/9781315119144 ScholarYuan, K.-H., (1998). 
Normal 51(2), 289–309. 10.1111/j.2044-8317.1998.tb00682.x
Ziegler, Stop state your intentions! Let’s forget ABC construction. 30(4), 239–242. 10.1027/1015-5759/a000228 – structured why. 3–11. 10.1027/2698-1866/a000002 (2015). unidimensionality items: Pitfalls loopholes. 231–237. 10.1027/1015-5759/a000309
1 Note simplify here. specificities, communalities, intercorrelations.
2 Item German Inventory-Short (BFI-S; Rammstedt 2005). taken ongoing format (Brauer 2022).
Volume 4, Issue 1, March 2023. ISSN: 2698-1866. Published online February 7, 2023. Psychological Test Adaptation and Development (2023), pp. 4-12. https://doi.org/10.1027/2698-1866/a000034. © 2023 The Author(s). Distributed Hogrefe OpenMind article license CC BY-NC-ND 4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0)
Journal
Journal title: Psychological test adaptation and development
Year: 2023
ISSN: ['2698-1866']
DOI: https://doi.org/10.1027/2698-1866/a000034